Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications

Authors

  • Xiao-Lin Wu
  • Jiaqi Xu
  • Guofei Feng
  • George R. Wiggans
  • Jeremy F. Taylor
  • Jun He
  • Changsong Qian
  • Jiansheng Qiu
  • Barry Simpson
  • Jeremy Walker
  • Stewart Bauck
Abstract

Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for the design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip; the latter is computed either as locus-averaged Shannon entropy (LASE) or haplotype-averaged Shannon entropy (HASE) and is adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs but required considerably more computing time, and the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of obligatory SNPs on each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide the location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function, which was locally and empirically maximized. The MOLO algorithm was capable of selecting a set of approximately evenly spaced and highly informative SNPs, which in turn led to increased imputation accuracy compared with selecting evenly spaced SNPs alone. Imputation accuracy increased with LD chip size, and the imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation errors occur at random, the imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of the imputation error rate was propagated to genomic prediction in an Angus population.
The utility of the MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNPs selected based on SNP-trait association in U.S. Holstein animals. With the MOLO algorithm, both the imputation error rate and the genomic prediction error rate were minimal.
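
To make two ingredients of the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-locus information is the Shannon entropy of a biallelic SNP computed from its allele frequency, and that non-uniform frame locations can be drawn from a symmetric Beta distribution with shape below 1, which places more mass near both chromosome ends. The function names, the shape value 0.5, and the use of random draws rather than fixed quantiles are all illustrative assumptions. The uniformity adjustment and the non-gap map length term of the objective function are not modeled here.

```python
import math
import random


def locus_entropy(p):
    """Shannon entropy in bits of a biallelic SNP with allele frequency p (assumed definition)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a monomorphic locus carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def locus_averaged_entropy(freqs):
    """Mean per-locus entropy over a candidate set of SNPs."""
    if not freqs:
        return 0.0
    return sum(locus_entropy(p) for p in freqs) / len(freqs)


def beta_frame_positions(n_snps, chrom_length, shape=0.5, seed=0):
    """Draw frame positions from Beta(shape, shape) scaled to the chromosome length.

    With shape < 1 the density is U-shaped, so both ends are enriched.
    The shape value is a hypothetical tuning parameter.
    """
    rng = random.Random(seed)
    return sorted(chrom_length * rng.betavariate(shape, shape) for _ in range(n_snps))


if __name__ == "__main__":
    # Intermediate allele frequencies give higher entropy than rare alleles.
    print(locus_averaged_entropy([0.5, 0.4, 0.3]))
    print(locus_averaged_entropy([0.05, 0.02, 0.01]))
    # Frame positions on a hypothetical 100 Mb chromosome.
    print(beta_frame_positions(10, 100.0))
```

Under these assumptions, a SNP with allele frequency 0.5 reaches the maximum of 1 bit, which is why an entropy-based criterion favors SNPs with intermediate allele frequencies over rare variants.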


Similar resources

Imputation of parent-offspring trios and their effect on accuracy of genomic prediction using Bayesian method

The objective of this study was to evaluate the imputation accuracy of parent-offspring trios under different scenarios. Using simulated datasets, the performance of Bayesian LASSO in genomic prediction was also examined. The genome consisted of 5 chromosomes, and each chromosome was set to a length of 1 Morgan. The number of SNPs per chromosome was 10,000. One hundred QTLs were randomly distributed a...


Optimal Design of Axial Flux Permanent Magnet Synchronous Motor for Electric Vehicle Applications Using GA and FEM

Axial Flux Permanent Magnet (AFPM) machines are attractive candidates for Electric Vehicle (EV) applications due to their compact axial structure, high efficiency, and high power and torque density. This paper presents general design characteristics of AFPM machines. Moreover, the torque density of the machine, which is selected as the main objective function, is enhanced by using a Genetic Algorithm (GA) a...


Comparing Different Marker Densities and Various Reference Populations Using Pedigree-Marker Best Linear Unbiased Prediction (BLUP) Model

For genomic selection to be applied successfully, the reference population and marker density must be chosen properly. The purpose of this study was to investigate the accuracy of genomic estimated breeding values at low (5K), intermediate (50K) and high (777K) marker densities in simulated populations, under different scenarios for selecting the reference population. Af...


Optimum Design of a Three-Phase Permanent Magnet Synchronous Motor for industrial applications

Permanent Magnet Synchronous Motors (PMSMs) have been widely used in many industrial applications. In this paper, a new method for multi-objective optimal design of a PMSM with a surface-mounted permanent magnet rotor is presented to achieve maximum efficiency and power density using a Bees algorithm for industrial applications. The objective function is a...


Development and Application of High-density SNP Arrays in Genomic Studies of Domestic Animals

In the past decade, there have been many advances in whole-genome sequencing in domestic animals, as well as the development of “next-generation” sequencing technologies and high-throughput genotyping platforms. Consequently, these advances have led to the creation of the high-density SNP array as a state-of-the-art tool for genetics and genomics analyses of domestic animals. The emergence and ...



Journal title:

Volume 11  Issue 

Pages  -

Publication date: 2016